Partially lexicalized parsing model utilizing rich features
Abstract
In this paper, we propose a partially lexicalized parsing model that uses rich features to improve parsing performance and reduce parsing cost. To disambiguate parse trees effectively, the model employs several features: a syntactic label feature, a content feature, a functional feature, and a size feature. It is only partially lexicalized, which reduces the parsing cost associated with lexical information. It is also designed to represent word order variation and constituent ellipsis in Korean sentences. Experimental results show that the proposed model, which uses more features, performs better even though it depends less on lexical information.
Similar resources
Scalable Discriminative Parsing for German
Generative lexicalized parsing models, which are the mainstay for probabilistic parsing of English, do not perform as well when applied to languages with different language-specific properties such as free(r) word order or rich morphology. For German and other non-English languages, linguistically motivated complex treebank transformations have been shown to improve performance within the frame...
Multilingual discriminative lexicalized phrase structure parsing
We provide a generalization of discriminative lexicalized shift reduce parsing techniques for phrase structure grammar to a wide range of morphologically rich languages. The model is efficient and outperforms recent strong baselines on almost all languages considered. It takes advantage of a dependency based modelling of morphology and a shallow modelling of constituency boundaries.
TOWARDS EFFICIENT STATISTICAL PARSING USING LEXICALIZED GRAMMATICAL INFORMATION by
For a long time, the goal of wide-coverage natural language parsers had remained elusive. Much progress has been made recently, however, with the development of lexicalized statistical models of natural language parsing. Although lexicalized tree adjoining grammar (TAG) is a lexicalized grammatical formalism whose development predates these recent advances, its application in lexicalized statis...
Cross Parser Evaluation and Tagset Variation : a French Treebank Study
This paper presents preliminary investigations on the statistical parsing of French by bringing a complete evaluation on French data of the main probabilistic lexicalized and unlexicalized parsers first designed on the Penn Treebank. We adapted the parsers on the two existing treebanks of French (Abeillé et al., 2003; Schluter and van Genabith, 2007). To our knowledge, mostly all of the results...
Cross parser evaluation : a French Treebanks study
This paper presents preliminary investigations on the statistical parsing of French by bringing a complete evaluation on French data of the main probabilistic lexicalized and unlexicalized parsers first designed on the Penn Treebank. We adapted the parsers on the two existing treebanks of French (Abeillé et al., 2003; Schluter and van Genabith, 2007). To our knowledge, mostly all of the results...